In this paper we give a new example of duality between fragmentation
and coagulation operators. Consider the space of partitions of mass
(i.e., decreasing sequences of nonnegative real numbers whose sum is 1)
and the two-parameter family of Poisson-Dirichlet distributions
PD(α, θ) that take values in this space. We introduce families of
random fragmentation and coagulation operators Frag_α and Coag_{α,θ},
respectively, with the following property: if the input to Frag_α has
PD(α, θ) distribution, then the output has PD(α, θ + 1) distribution,
while the reverse is true for Coag_{α,θ}. This result may be proved
using a subordinator representation and it provides a companion set of
relations to those of Pitman between PD(α, θ) and PD(αβ, θ). Repeated
application of the Frag_α operators gives rise to a family of
fragmentation chains. We show that these Markov chains can be encoded
naturally by certain random recursive trees, and use this representation
to give an alternative and more concrete proof of the
coagulation-fragmentation duality.

1. Introduction. The subject of this paper is a duality relation for a
fragmentation operator and a coagulation operator when applied to certain
Poisson-Dirichlet distributions. The idea of duality by time reversal for
fragmentation and coagulation is very natural: the opposite of splitting
blocks apart is coalescing them. However, demonstrating duality for
coagulation and fragmentation processes with desirable properties seems to
be a difficult problem and there is no general theory. There are, however,
several beautiful examples where some form of duality does hold; for
instance, the additive coalescent of Aldous and Pitman [4] and the
Bolthausen-Sznitman [10] coalescent, whose duality properties were
discovered by Pitman [18] (see also the discussion in Chapter 5 of [19]).

We work on the space of partitions of mass (i.e., decreasing sequences
of nonnegative real numbers whose sum is 1). Fix 0 ≤ α < 1 and θ > -α.
Our fragmentation operator takes a size-biased pick from the sequence and
splits the chosen block with a PD(α, 1 - α) random variable. Our
coagulation operator generates a Beta((1 - α)/α, (θ + α)/α) random variable
and coalesces that proportion of the blocks. If the input to the
fragmentation operator has PD(α, θ) distribution, then its output has
PD(α, θ + 1) distribution. Moreover, an application of the coagulation
operator allows us to go back the other way. This extends a result of
Bertoin and Goldschmidt [7], which covered the α = 0 case. It also provides
a companion set of relations to those of [18] for the Poisson-Dirichlet
distributions.

Building a Markov process using the fragmentation operator gives a
self-similar fragmentation process of index 1, dislocation measure
PD(α, 1 - α) and erosion coefficient 0, in the terminology of Bertoin [6].
We show that this fragmentation process is naturally embedded in certain
random recursive trees (i.e., rooted labeled trees whose labels increase
along paths away from the root). These (α, θ)-recursive trees can be viewed
as nested systems of exchangeable partitions, and their construction is an
elaboration of the "Chinese restaurant process" of Dubins and Pitman (see
[19]). They can also be regarded as examples of graphs constructed by
preferential attachment [9, 15, 21], and our results complement those which
have previously been obtained in special cases, for example, α = 1/2, θ = 0
[23] or α = 1/2, θ = 1/2 [14].

In Section 2 we collect various definitions and results concerning the fam- 
ily of Poisson-Dirichlet distributions. The fragmentation and coagulation 
operators are defined precisely in Section 3 and the duality relationship be- 
tween them is proved using a subordinator representation. The extension to 
fragmentation and coagulation processes is described in Section 4. In Sec- 
tion 5 we introduce the random recursive tree model and describe how it 
encodes the fragmentation process. In Section 6 we use this representation to 
give an alternative and more concrete proof of the duality between the frag- 
mentation and coagulation processes. Finally, in Section 7, we comment on 
relationships between the recursive tree model and previous representations 
of the Chinese restaurant process in terms of continuous-time branching 
processes. 

2. Poisson-Dirichlet distributions. We will be concerned with properties
of the two-parameter Poisson-Dirichlet distribution, introduced in its full
generality in [20]. We will first define the PD(α, θ) distribution and then
mention some of its properties which we will use in the sequel.

Definition 2.1 (Stick-breaking scheme). For 0 ≤ α < 1 and θ > -α, let
B_1, B_2, ... be independent random variables such that
B_n ~ Beta(1 - α, θ + nα) for all n ≥ 1. Let

\[ \tilde{X}_1 = B_1 \quad\text{and}\quad \tilde{X}_n = (1 - B_1) \cdots (1 - B_{n-1}) B_n \quad\text{for } n \ge 2. \]

Let X_1 ≥ X_2 ≥ ... be the ranked values of the X̃_n. Then define the
PD(α, θ) distribution to be the law of the vector (X_1, X_2, ...).

The sequence (X̃_1, X̃_2, ...) is a size-biased permutation of
(X_1, X_2, ...) and is said to have the Griffiths-Engen-McCloskey
GEM(α, θ) distribution. In particular, X̃_1 is a size-biased pick from
(X_1, X_2, ...). The next proposition is a direct consequence of
Definition 2.1.
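The stick-breaking scheme of Definition 2.1 translates directly into a sampler. The following is a minimal sketch, not taken from the paper: the function names `gem_stick_breaking` and `pd_sample` and the truncation level `n_terms` are illustrative assumptions, and truncating after finitely many sticks gives only an approximation of the infinite vector (the unbroken remainder is returned separately).

```python
import random

def gem_stick_breaking(alpha, theta, n_terms, rng):
    """First n_terms stick-breaking weights from Definition 2.1 (GEM order)."""
    if not (0 <= alpha < 1 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    weights, remaining = [], 1.0
    for n in range(1, n_terms + 1):
        b = rng.betavariate(1 - alpha, theta + n * alpha)  # B_n
        weights.append(remaining * b)    # (1 - B_1)...(1 - B_{n-1}) B_n
        remaining *= 1 - b               # mass not yet broken off
    return weights, remaining

def pd_sample(alpha, theta, n_terms, rng):
    """Ranked weights: a truncated approximation of a PD(alpha, theta) vector."""
    weights, remaining = gem_stick_breaking(alpha, theta, n_terms, rng)
    return sorted(weights, reverse=True), remaining

rng = random.Random(0)
x, leftover = pd_sample(0.5, 0.5, 2000, rng)
```

The unsorted output of `gem_stick_breaking` is in size-biased order, so its first entry plays the role of the size-biased pick X̃_1.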

Proposition 2.2 ([20]). Let (Y_n, n ≥ 1) ~ PD(α, θ + α) and let B be an
independent Beta(1 - α, θ + α) random variable. Let the sequence
(X_m, m ≥ 1) be defined by inserting B into the sequence
((1 - B)Y_n, n ≥ 1) and reranking. Then (X_m, m ≥ 1) has PD(α, θ)
distribution and B is a size-biased pick from (X_m, m ≥ 1).

There are many representations of the PD(α, θ) distribution. For α = 0
and θ > 0, we have Kingman's subordinator representation:

Proposition 2.3 ([13]). Let γ be a standard gamma subordinator on
the time interval [0, θ], with ranked jumps ξ_1 ≥ ξ_2 ≥ ... ≥ 0. Then

\[ \frac{1}{\gamma(\theta)} (\xi_1, \xi_2, \ldots) \sim \mathrm{PD}(0, \theta) \]

independently of γ(θ), which has a Gamma(θ, 1) distribution.

A related subordinator representation holds for 0 < α < 1 and θ > 0:

Proposition 2.4 ([20]). Fix 0 < α < 1. Let (τ(s), s ≥ 0) be a
subordinator with Lévy measure α x^{-α-1} e^{-x} dx. Let S be an independent
Gamma(θ/α, Γ(1 - α)) random variable and let ξ_1 ≥ ξ_2 ≥ ... ≥ 0 be the
ranked jumps of the subordinator in the time interval [0, S]. Then

\[ \frac{1}{\tau(S)} (\xi_1, \xi_2, \ldots) \sim \mathrm{PD}(\alpha, \theta) \]

independently of τ(S), which has a Gamma(θ, 1) distribution.
3. Dual fragmentation and coagulation operators. Let 0 ≤ α < 1, θ > -α
and

\[ \Delta_\infty = \Bigl\{ x = (x_1, x_2, \ldots) : x_1 \ge x_2 \ge \cdots \ge 0,\ \sum_{i=1}^{\infty} x_i = 1 \Bigr\}. \]

Let Frag_α : Δ_∞ → Δ_∞ be a random operator which takes a size-biased pick
from its input, splits it using an independent PD(α, 1 - α) random variable
and then puts the resulting vector in decreasing order. More precisely, fix
x ∈ Δ_∞. Let I be an index chosen according to the distribution

\[ P(I = i) = x_i, \quad i \ge 1, \]

and let η = (η_1, η_2, ...) ~ PD(α, 1 - α) independently of I. Then

\[ \mathrm{Frag}_\alpha(x) = (x_1, x_2, \ldots, x_{I-1}, x_I \eta_1, x_I \eta_2, \ldots, x_{I+1}, x_{I+2}, \ldots)^{\downarrow}, \]

where here, as throughout this paper, the arrow used as a superscript on
a sequence means that the sequence is to be put into decreasing order. Let
Coag_{α,θ} : Δ_∞ → Δ_∞ be another random operator which picks a
Beta((1 - α)/α, (θ + α)/α) proportion of the blocks if α > 0, or a
deterministic proportion 1/(θ + 1) if α = 0, joins them together and puts
the resulting vector in decreasing order. More precisely, if α > 0, let
B ~ Beta((1 - α)/α, (θ + α)/α), and if α = 0, let B = 1/(θ + 1). Let
I_1, I_2, ... be 0-1 random variables which, given B, are independent and
identically distributed with Bernoulli(B) law. Then

\[ \mathrm{Coag}_{\alpha,\theta}(x) = \Bigl( \sum_{i : I_i = 1} x_i,\ (x_i : I_i = 0) \Bigr)^{\downarrow}. \]
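The two operators can be sketched on finite vectors as follows. This is an illustrative approximation under stated assumptions, not the paper's construction: the names `pd_truncated`, `frag` and `coag` are hypothetical, the PD(α, 1 - α) splitting vector is approximated by `n_terms` stick-breaking pieces (with the unbroken remainder kept as one extra block so that total mass is preserved), and the input is a finite list rather than an infinite sequence.

```python
import random

def pd_truncated(alpha, theta, n_terms, rng):
    """Ranked stick-breaking approximation of PD(alpha, theta); the unbroken
    remainder of the stick is kept as one extra block so the sum is 1."""
    parts, remaining = [], 1.0
    for n in range(1, n_terms + 1):
        b = rng.betavariate(1 - alpha, theta + n * alpha)
        parts.append(remaining * b)
        remaining *= 1 - b
    parts.append(remaining)
    return sorted(parts, reverse=True)

def frag(x, alpha, rng, n_terms=200):
    """Sketch of Frag_alpha: split a size-biased pick by PD(alpha, 1 - alpha)."""
    i = rng.choices(range(len(x)), weights=x)[0]      # P(I = i) = x_i
    eta = pd_truncated(alpha, 1 - alpha, n_terms, rng)
    rest = x[:i] + x[i + 1:]
    return sorted(rest + [x[i] * e for e in eta], reverse=True)

def coag(y, alpha, theta, rng):
    """Sketch of Coag_{alpha,theta}: merge a Bernoulli(B)-selected set of blocks."""
    if alpha > 0:
        b = rng.betavariate((1 - alpha) / alpha, (theta + alpha) / alpha)
    else:
        b = 1 / (theta + 1)                           # deterministic proportion
    picked = [rng.random() < b for _ in y]
    merged = sum(v for v, p in zip(y, picked) if p)
    kept = [v for v, p in zip(y, picked) if not p]
    if merged > 0:
        kept.append(merged)
    return sorted(kept, reverse=True)

rng = random.Random(1)
x = pd_truncated(0.5, 0.3, 200, rng)   # stand-in for a PD(0.5, 0.3) input
y = frag(x, 0.5, rng)                  # one fragmentation step
z = coag(y, 0.5, 0.3, rng)             # one coagulation step
```

Both steps preserve total mass, which is the only property the sketch checks; it does not by itself verify the distributional duality.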

Theorem 3.1. Let 0 ≤ α < 1 and θ > -α. Suppose that X and Y are
random variables that take values in Δ_∞. Then the following statements are
equivalent:

• X ~ PD(α, θ) and, conditional on X, Y ~ Frag_α(X).

• Y ~ PD(α, θ + 1) and, conditional on Y, X ~ Coag_{α,θ}(Y).

Proof. The α = 0 case is Proposition 2 of [7] but, for completeness,
we reproduce the proof here. Let (γ(t), t ≥ 0) be a standard gamma process
and let ξ_1 ≥ ξ_2 ≥ ... ≥ 0 be the jumps of (γ(t), 0 ≤ t ≤ θ + 1). Then, by
Proposition 2.3,

\[ \frac{1}{\gamma(\theta + 1)} (\xi_1, \xi_2, \ldots) \sim \mathrm{PD}(0, \theta + 1) \]

independently of γ(θ + 1), which has a Gamma(θ + 1, 1) distribution. Now
let ξ'_1 ≥ ξ'_2 ≥ ... ≥ 0 be the jumps of (γ(t), 0 ≤ t ≤ 1) and let
ξ''_1 ≥ ξ''_2 ≥ ... ≥ 0 be the jumps of (γ(t), 1 < t ≤ θ + 1). Note that
coagulating the jumps which happen in the time interval [0, 1] coagulates a
proportion 1/(θ + 1) of ξ_1, ξ_2, ... chosen uniformly at random. Then the
following relationships hold independently:

\[ \frac{1}{\gamma(1)} (\xi'_1, \xi'_2, \ldots) \sim \mathrm{PD}(0, 1), \tag{1} \]

\[ \frac{1}{\gamma(\theta + 1) - \gamma(1)} (\xi''_1, \xi''_2, \ldots) \sim \mathrm{PD}(0, \theta), \tag{2} \]

\[ \frac{\gamma(1)}{\gamma(\theta + 1)} \sim \mathrm{Beta}(1, \theta). \tag{3} \]

The independence is a consequence of beta-gamma algebra and the fact
that the jumps of a subordinator on disjoint time intervals are independent.
From (1), it is clear that in the fragmentation step we split with PD(0, 1).
Furthermore, (2), (3) and Proposition 2.2 then imply that

\[ \frac{1}{\gamma(\theta + 1)} (\gamma(1), \xi''_1, \xi''_2, \ldots)^{\downarrow} \sim \mathrm{PD}(0, \theta) \]

and that γ(1)/γ(θ + 1) is a size-biased pick from this vector.

Suppose now that α > 0. Then a variant of the same argument applies. Let
(τ(t), t ≥ 0) be a subordinator of Lévy measure α x^{-α-1} e^{-x} dx and let
S be an independent Gamma((θ + 1)/α, Γ(1 - α)) random variable. Let B be an
independent Beta((1 - α)/α, (θ + α)/α) random variable. Suppose that
ξ_1 ≥ ξ_2 ≥ ... ≥ 0 are the ranked jumps of (τ(t), 0 ≤ t ≤ S), that
ξ'_1 ≥ ξ'_2 ≥ ... ≥ 0 are the ranked jumps of (τ(t), 0 ≤ t ≤ BS) and that
ξ''_1 ≥ ξ''_2 ≥ ... ≥ 0 are the ranked jumps of (τ(t), BS < t ≤ S). Then, as
θ + 1 > 0, by Proposition 2.4,

\[ \frac{1}{\tau(S)} (\xi_1, \xi_2, \ldots) \sim \mathrm{PD}(\alpha, \theta + 1) \]

independently of τ(S), which has a Gamma(θ + 1, 1) distribution.

Now note that coagulating the jumps that occur in the interval [0, BS]
coagulates a proportion B of the jumps ξ_1, ξ_2, .... We have
BS ~ Gamma((1 - α)/α, Γ(1 - α)) by standard beta-gamma algebra. Then the
following relationships hold independently:

\[ \frac{1}{\tau(BS)} (\xi'_1, \xi'_2, \ldots) \sim \mathrm{PD}(\alpha, 1 - \alpha), \tag{4} \]

\[ \frac{1}{\tau(S) - \tau(BS)} (\xi''_1, \xi''_2, \ldots) \sim \mathrm{PD}(\alpha, \theta + \alpha), \tag{5} \]

\[ \frac{\tau(BS)}{\tau(S)} \sim \mathrm{Beta}(1 - \alpha, \theta + \alpha). \tag{6} \]

From (4), we see that in the fragmentation step we split with PD(α, 1 - α).
From (5), (6) and Proposition 2.2, we see that

\[ \frac{1}{\tau(S)} (\tau(BS), \xi''_1, \xi''_2, \ldots)^{\downarrow} \sim \mathrm{PD}(\alpha, \theta) \]

and that τ(BS)/τ(S) is a size-biased pick from this vector. □
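The beta-gamma step in the α > 0 case can be spelled out as follows. This is a worked sketch rather than part of the original proof; it assumes that Gamma(a, λ) denotes shape a and rate λ, which is the convention under which Proposition 2.4 yields a Gamma(θ, 1) total.

```latex
% Assumption: Gamma(a, \lambda) has shape a and rate \lambda.
% Beta-gamma algebra: if B ~ Beta(a, b) and S ~ Gamma(a + b, \lambda) are independent, then
\[
  BS \sim \mathrm{Gamma}(a, \lambda), \qquad
  (1 - B)S \sim \mathrm{Gamma}(b, \lambda), \qquad
  BS \ \text{and}\ (1 - B)S \ \text{are independent}.
\]
% Here a = (1-\alpha)/\alpha, b = (\theta+\alpha)/\alpha, \lambda = \Gamma(1-\alpha),
% so that a + b = (\theta+1)/\alpha, the shape parameter of S.
% Laplace exponent of the subordinator \tau:
\[
  \Phi(q) = \int_0^\infty \bigl(1 - e^{-qx}\bigr)\, \alpha x^{-\alpha-1} e^{-x}\, dx
          = \Gamma(1-\alpha)\bigl((1+q)^{\alpha} - 1\bigr).
\]
% For T ~ Gamma(c, \Gamma(1-\alpha)) independent of \tau:
\[
  E\bigl[e^{-q\,\tau(T)}\bigr]
    = \Bigl(\frac{\Gamma(1-\alpha)}{\Gamma(1-\alpha) + \Phi(q)}\Bigr)^{c}
    = (1+q)^{-\alpha c},
  \qquad\text{that is,}\qquad \tau(T) \sim \mathrm{Gamma}(\alpha c, 1).
\]
% Taking c = a and c = b on the disjoint intervals [0, BS] and (BS, S]:
\[
  \tau(BS) \sim \mathrm{Gamma}(1-\alpha, 1), \qquad
  \tau(S) - \tau(BS) \sim \mathrm{Gamma}(\theta+\alpha, 1), \qquad
  \frac{\tau(BS)}{\tau(S)} \sim \mathrm{Beta}(1-\alpha, \theta+\alpha).
\]
```

The last display is relation (6), and the two Gamma shapes are exactly the second parameters appearing in (4) and (5).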



Remarks. (i) Corollary 13 of [18] gives a set of duality relations for
certain coagulation and fragmentation operators applied to
Poisson-Dirichlet distributions. In particular, for 0 < α < 1, 0 ≤ β < 1
and θ > -αβ, PD(αβ, θ) is fragmented in such a way that each block is split
with an independent PD(α, -αβ) random variable [call this (α, -αβ)-FRAG].
This results in a PD(α, θ) random variable. In reverse, a coagulation of
PD(α, θ) which coagulates infinitely many different groups of blocks gives
PD(αβ, θ) back. The coagulation operator is a little more involved: suppose
that the PD(α, θ) random variable is Y = (Y_1, Y_2, ...). Take an
independent random variable, Q = (Q_1, Q_2, ...), with PD(β, θ/α)
distribution, and create an open subset I_Q of [0, 1] composed of open
intervals whose lengths are given by the vector Q:

\[ I_Q = (0, Q_1) \cup \bigcup_{i=2}^{\infty} (Q_1 + \cdots + Q_{i-1},\ Q_1 + \cdots + Q_i). \]

Now throw independent U(0, 1) random variables U_1, U_2, ... down on the
interval. Let C_1 = {j ≥ 1 : U_j < Q_1} and, for i ≥ 2,
C_i = {j ≥ 1 : U_j ∈ (Q_1 + ... + Q_{i-1}, Q_1 + ... + Q_i)}. Finally, let
X = (X_1, X_2, ...) be obtained by reranking the terms

\[ \tilde{X}_i = \sum_{j \in C_i} Y_j, \quad i \ge 1. \]

Then X has a PD(αβ, θ) distribution. We denote by (β, θ/α)-COAG the
operation on the vector Y which produces X.

Theorem 3.1 provides a companion set of relations (see Figure 1). While
Pitman's relations affect the first parameter multiplicatively, our
relations affect the second parameter additively.

(ii) For α = 1/2 and θ = n - 1/2, n ≥ 1, Jim Pitman has pointed out
that the operation of splitting a size-biased pick from PD(α, θ) according
to PD(α, 1 - α) can be interpreted in terms of the continuum random tree T
embedded in a Brownian excursion, as follows. Let R_n be the subtree of T
spanned by n points picked at random according to the mass measure μ of
the tree (corresponding to Lebesgue measure on [0, 1]). Let μ_n be the image
of μ on R_n via the map which takes a point t ∈ T to its closest point in
R_n. It follows from the line-breaking construction of R_n in [1] that μ_n
is a random discrete distribution whose ranked atoms are distributed
according to PD(1/2, n - 1/2) and that, in the growth step from R_n to
R_{n+1}, a size-biased choice of one of these atoms is split according to
PD(1/2, 1/2) to create the atoms of μ_{n+1}. The inverse coagulation
operation can also be seen in this setting as a corollary of Aldous's
results. It appears that similar interpretations in terms of continuum trees
for other values of (α, θ) can be based on Section 5.3 of [12]. Such
interpretations are the subject of work in progress by Pitman and Winkel.
(See [2, 3, 19] for further background.)

[Figure 1: two diagrams. Left: (α, -αβ)-FRAG maps PD(αβ, θ) to PD(α, θ) and
(β, θ/α)-COAG maps it back. Right: Frag_α maps PD(α, θ) to PD(α, θ + 1) and
Coag_{α,θ} maps it back.]

Fig. 1. Left: Pitman's duality relations. Right: Theorem 3.1.

4. Fragmentation and coagulation processes. Define a discrete-time
Markov fragmentation chain (X(i), i ≥ 0) that takes values in Δ_∞ as
follows: for i ≥ 0, conditional on X(i),

\[ X(i + 1) \sim \mathrm{Frag}_\alpha(X(i)). \]

If X(0) ~ PD(α, θ), then, by Theorem 3.1, X(i) ~ PD(α, θ + i) for i ≥ 1.
Likewise, define an inhomogeneous Markov coagulation chain by

\[ \ldots, X(i + 1), X(i), \ldots, X(1), X(0). \]

By Theorem 3.1, conditional on X(i + 1),

\[ X(i) \sim \mathrm{Coag}_{\alpha,\theta+i}(X(i + 1)) \]

for i ≥ 0.

We can construct a continuous-time Markov fragmentation process
(Y(t), t ≥ 0) by taking an independent standard Poisson process
(N(t), t ≥ 0) and letting

\[ Y(t) = X(N(t)) \]

for t ≥ 0. In the terminology of Bertoin [6], this is a self-similar
fragmentation of index 1, erosion coefficient 0 and dislocation measure
PD(α, 1 - α). Since the dislocation measure is finite (it is a probability
measure), the fact that the fragmentation has index of self-similarity s
just means that a block of size x splits at a rate proportional to x^s, and
independently of the other blocks. Since here the total rate of splitting is
1 at any time, and we split a size-biased pick from among the blocks, we
must have each block splitting at the rate of its length, that is, s = 1.

Suppose that we fix θ. In the α = 0 case, it was shown in [7] that a dual
Markovian coagulation chain may be defined by (Y(e^{-t}), t ≥ 0). It can be
checked that when the state has distribution PD(0, θ + n), for any n ≥ 1,
the process waits an exponential time of parameter n and then jumps to
a state distributed as PD(0, θ + n - 1). In principle, the same construction
may be performed in the α > 0 case. However, here the inhomogeneity of the
discrete-time coagulation chain becomes a problem. In the α = 0 case, the
distributions PD(0, θ) are almost surely distinguishable as θ varies. Thus,
it is possible to tell from the current state which coagulation operator to
apply to it to get the next state. In the α > 0 case, however, the
distributions PD(α, θ) are mutually absolutely continuous as θ varies. Thus,
it is not possible to detect almost surely from the state what the second
parameter is and then work out which coagulation operator to apply.
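The exponential waiting time of parameter n can be checked by a short calculation. This is a sketch under the assumption that the dual chain is read as Y evaluated at time e^{-t}, so that the state at dual time t is X(N(e^{-t})); the symbols T_n and W below are introduced only for this illustration.

```latex
% Assumption: the dual chain at time t is Y(e^{-t}) = X(N(e^{-t})).
% Write u = e^{-t} and condition on N(u) = n. The jump times of N in [0, u]
% are then the order statistics of n independent uniforms on [0, u], so the
% last jump time T_n satisfies
\[
  P\bigl(T_n \le v \,\big|\, N(u) = n\bigr) = \Bigl(\frac{v}{u}\Bigr)^{n}, \qquad 0 \le v \le u .
\]
% In the reversed time scale the next jump occurs after a wait
% W = \log(u / T_n), and
\[
  P(W > s) = P\bigl(T_n < u e^{-s}\bigr) = e^{-ns}, \qquad s \ge 0,
\]
% so W is exponential with parameter n, after which N drops from n to n - 1.
```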



Remarks. In [7], the α = 0 case of these processes is shown to arise
naturally in the context of the genealogy of certain continuous-state
branching processes. We do not see any way to generalize those results to
the case α > 0.



5. Random recursive trees. Let α ∈ [0, 1) and θ > -α. An exchangeable
(α, θ) partition of N (or of any infinite subset A ⊆ N) is defined as
follows:

(i) Generate a PD(α, θ)-distributed vector (Y_1, Y_2, Y_3, ...).

(ii) Conditionally on Y_1, Y_2, Y_3, ..., assign i to block j with
probability Y_j, independently for each i ∈ A.

Suppose we order the blocks of this partition in increasing order of their
smallest elements. Let B_i be the ith block and let F_i be its asymptotic
frequency, that is,

\[ F_i = \lim_{n \to \infty} \frac{|B_i \cap \{1, 2, \ldots, n\}|}{|A \cap \{1, 2, \ldots, n\}|}. \]

Then (F_1, F_2, ...) is a size-biased ordering of (Y_1, Y_2, ...). In
particular, (F_1, F_2, ...) has the GEM(α, θ) distribution.

An alternative way to construct an exchangeable (α, θ) partition is via
the Chinese restaurant process of Dubins and Pitman (see [19]), defined as
follows. Person 1 enters a Chinese restaurant and sits at the first table.
Person 2 sits either at the same table or at a new one; in general, each
subsequent person (numbered successively 3, 4, ...) sits either at one of
the occupied tables or at a new one. Suppose that people 1, 2, ..., n have
sat at k tables, where table i has n_i customers (with n_i ≥ 1 and
n_1 + ... + n_k = n). Then person n + 1 starts a new table with probability

\[ \frac{\theta + k\alpha}{n + \theta} \]

and sits at table i with probability

\[ \frac{n_i - \alpha}{n + \theta} \]

for 1 ≤ i ≤ k (see Figure 2). The partition of N into blocks that correspond
to the different tables is an exchangeable (α, θ) partition of N. (Of course,
the construction for general A ⊆ N is analogous.)
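The seating rule can be simulated directly. The following minimal sketch uses a hypothetical function name, `chinese_restaurant`, and tracks only the table sizes, not the identities of the customers at each table.

```python
import random

def chinese_restaurant(alpha, theta, n_customers, rng):
    """Table sizes after n_customers arrivals under the (alpha, theta) rule."""
    tables = [1]                        # person 1 sits at the first table
    for _ in range(1, n_customers):     # each further person in turn
        k = len(tables)
        # Unnormalised weights: n_i - alpha for table i, theta + k*alpha for
        # a new table. They sum to n + theta, the denominator in the text.
        weights = [size - alpha for size in tables] + [theta + k * alpha]
        choice = rng.choices(range(k + 1), weights=weights)[0]
        if choice == k:
            tables.append(1)            # start a new table
        else:
            tables[choice] += 1         # join an occupied table
    return tables

tables = chinese_restaurant(0.5, 0.5, 1000, random.Random(1))
```

Tables are created in order of their first customer, so `tables` lists block sizes in increasing order of smallest element, matching the ordering used for the asymptotic frequencies.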

We are now ready to describe the model of an (α, θ)-recursive tree. A
recursive tree is a rooted labeled tree such that the vertex labels increase
along paths away from the root. We now construct a random recursive tree
as follows. We start with the root labeled 0 and a single child labeled 1.
Vertices 2, 3, ... are now added in turn; vertex i is added as a child of
one of the existing vertices 0, 1, ..., i - 1. When vertex i is added, the
probability that it is added as a child of vertex j is proportional to
1 - α + α k_j if j ≥ 1, where k_j is the current number of children of
vertex j. It is added as a child of vertex 0 with probability proportional
to θ + α k_0. A tree on Z_+ that arises in this way is called an
(α, θ)-recursive tree.
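The discrete-time construction can be sketched as follows. The function name `recursive_tree` and the parent-array representation are illustrative choices, not notation from the paper.

```python
import random

def recursive_tree(alpha, theta, n_vertices, rng):
    """Parent array of a tree on 0..n_vertices grown by the rule above."""
    parent = [None, 0]            # root 0 with a single child labeled 1
    children = [1, 0]             # children[j] = current k_j
    for i in range(2, n_vertices + 1):
        # Root weight theta + alpha*k_0; weight 1 - alpha + alpha*k_j otherwise.
        weights = [theta + alpha * children[0]]
        weights += [1 - alpha + alpha * children[j] for j in range(1, i)]
        j = rng.choices(range(i), weights=weights)[0]
        parent.append(j)
        children[j] += 1
        children.append(0)
    return parent

parent = recursive_tree(0.5, 0.5, 500, random.Random(3))
```

Each vertex attaches to an earlier one, so labels increase along every path away from the root, as required of a recursive tree.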

It is useful to generate the same distribution via a continuous-time Markov
chain. Namely, we again start with the root labeled 0 and a single child
labeled 1. From then on, new children of the root arrive at rate θ + α k_0
and new children of vertex j, j ≥ 1, arrive at rate 1 - α + α k_j (where,
again, k_j is the current number of children of vertex j). The new vertices
are numbered in the order they arrive. This continuous-time construction
will make it possible to deduce directly various useful independence
properties of the tree.

Now consider the following procedure. Remove vertex 0 and record the
partition of N given by the resulting forest. Call this partition B^{(0)},
where the blocks B^{(0)}_1, B^{(0)}_2, ... are listed in increasing order of
smallest element (note that this smallest element is necessarily the root of
one of the recursive trees in the forest). Now, for i ≥ 1, define B^{(i)} to
be the partition of N \ {1, 2, ..., i} obtained by removing vertices
0, 1, 2, ..., i (again, the blocks B^{(i)}_1, B^{(i)}_2, ... of this
partition are listed in increasing order of smallest element).
Theorem 5.1. For all i ≥ 0, the blocks B^{(i)}_1, B^{(i)}_2, ... form an
exchangeable (α, θ + i) partition of N \ {1, 2, ..., i}. In particular,
B^{(i)}_1, B^{(i)}_2, ... possess asymptotic frequencies
F^{(i)}_1, F^{(i)}_2, ... such that (F^{(i)})^↓ ~ PD(α, θ + i) and
F^{(i)} ~ GEM(α, θ + i) for all i ≥ 0. Moreover, letting
G^{(i)} = (F^{(i)})^↓, we have that (G^{(i)}, i ≥ 0) is a Markov chain such
that the following statements hold:

• For i ≥ 1, conditional on G^{(i)}, G^{(i+1)} has the same distribution as
Frag_α(G^{(i)}).

• For i ≥ 1, conditional on G^{(i+1)}, G^{(i)} has the same distribution as
Coag_{α,θ+i}(G^{(i+1)}).

This result is illustrated in Figure 3. It entails that as long as
X(0) ~ PD(α, θ), then

\[ (G^{(i)}, i \ge 0) \overset{d}{=} (X(i), i \ge 0), \]

where (X(i), i ≥ 0) is the Markov fragmentation chain of Section 4, and
clearly implies Theorem 3.1. The size-biased view of the fragmentation chain
given by the random recursive tree seems a very natural description: rather
than having two sources of external randomness (one to take a size-biased
pick from the state vector and another to split it), here the randomness is
entirely in the tree; given the tree, the fragmentation is deterministic. The
tree can be thought of as a concrete representation of the filtration of the
fragmentation.

Remarks. (i) For a general survey of results on recursive trees, see [22].
Random recursive trees similar to certain (α, θ)-recursive trees have been
studied by Szymański [23] and by Mahmoud, Smythe and Szymański [14].
One object of interest is the kth branch, that is, the subtree rooted at k
in a random recursive tree labeled by {0, 1, ..., n}. Call the size of the
kth branch T_{n,k}. In our model, F^{(k-1)}_1 is the almost sure limit of
T_{n,k}/n as n → ∞. In his Theorem 8, Szymański [23] finds the mean and
variance of T_{n,k} in the (1/2, 0)-recursive tree; his results are
consistent with the limiting mean and variance implied by Theorem 5.1.
Theorem 5 of [14] gives the limiting distribution of T_{n,k}/n as
Beta(1/2, k) in a model which is essentially the (1/2, 1/2)-recursive tree;
this is also what we expect from Theorem 5.1. Various related classes of
random graphs and trees constructed by preferential attachment are
considered, for example, by Barabási and Albert [5], Bollobás, Riordan,
Spencer and Tusnády [9], Móri [15] and Rudas, Tóth and Valkó [21].

(ii) Taking α = -1/m and θ = r/m for integers m ≥ 1 and r ≥ 2 in the
Chinese restaurant process gives an exchangeable random partition into r
blocks whose asymptotic frequencies are the decreasing rearrangement of a
Dirichlet(1/m, ..., 1/m) random vector with r parameters all equal to 1/m.
The construction of the (α, θ)-recursive tree works for these parameters as
well [to be precise, the probability we add a vertex to the root is
proportional to (r - k_0)^+/m and the probability we add to any other vertex
is proportional to (m + 1 - k_j)^+/m], giving a tree whose out-degrees are
all equal to m + 1, except for the root which has degree r. Analogs of
Theorems 3.1 and 5.1 hold in this case too; see [7] for details.

Fig. 3. Fragmentation and coagulation for the (α, θ)-recursive tree.

6. Proof of Theorem 5.1. We prove Theorem 5.1 in a series of lemmas.
(Note that this proof is independent of Theorem 3.1.)

Lemma 6.1. The blocks B^{(i)}_1, B^{(i)}_2, ... form an exchangeable
(α, θ + i) partition of N \ {1, ..., i}.

Proof. By the children of a set of vertices, we will mean here vertices
which are neighbors of that set in the tree, but are not contained within it.
We work via the Chinese restaurant process. Think of children of the set of
vertices {0, 1, 2, ..., i} as starting new tables. Consider the construction
of the (α, θ)-recursive tree and suppose that vertices labeled
i + 1, i + 2, ..., i + n have already arrived, including k children of the
set {0, 1, 2, ..., i}. Suppose that the k subtrees rooted at these children
have sizes n_1, n_2, ..., n_k, where n_1 + ... + n_k = n (the subtrees
correspond to tables in the Chinese restaurant process). Then i + n + 1 forms
a new table at rate θ + i(1 - α) + α(k + i) = θ + i + αk and is added to
table j at rate n_j(1 - α) + α(n_j - 1) = n_j - α for 1 ≤ j ≤ k. Hence, the
total rate is (n_1 + ... + n_k) - kα + θ + i + αk = n + θ + i. Thus,
i + n + 1 forms a new table with probability (θ + i + αk)/(n + θ + i) and
adds to table j with probability (n_j - α)/(n + θ + i) for 1 ≤ j ≤ k. Hence,
we have a Chinese restaurant process of parameters (α, θ + i) and so removing
vertices 0, 1, 2, ..., i gives blocks which form an (α, θ + i) partition. □
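The rate bookkeeping in this proof can be checked mechanically on a simulated tree. The sketch below is illustrative: `check_lemma_6_1_rates` is a hypothetical helper, the tree is grown by the discrete-time rule of Section 5, and exact rational arithmetic is used so that the two rate identities can be compared with `==`.

```python
import random
from fractions import Fraction

def check_lemma_6_1_rates(alpha, theta, i, n, rng):
    """Grow a tree on 0..i+n, then compare the attachment rates for vertex
    i+n+1 with theta + i + alpha*k (new table) and n_j - alpha (table j)."""
    parent, kids = [None, 0], [1, 0]            # root 0 with child 1
    for v in range(2, i + n + 1):
        w = [float(theta + alpha * kids[0])]
        w += [float(1 - alpha + alpha * kids[j]) for j in range(1, v)]
        j = rng.choices(range(v), weights=w)[0]
        parent.append(j)
        kids[j] += 1
        kids.append(0)

    def rate(v):                                # rate at which v gains a child
        if v == 0:
            return theta + alpha * kids[0]
        return 1 - alpha + alpha * kids[v]

    # table[v] = the child of {0, ..., i} whose subtree contains v (v > i)
    table = {}
    for v in range(i + 1, i + n + 1):
        table[v] = v if parent[v] <= i else table[parent[v]]
    sizes = {}
    for t in table.values():
        sizes[t] = sizes.get(t, 0) + 1
    k = len(sizes)                              # number of occupied tables

    new_table_ok = sum(rate(v) for v in range(i + 1)) == theta + i + alpha * k
    old_tables_ok = all(
        sum(rate(v) for v in table if table[v] == t) == sizes[t] - alpha
        for t in sizes
    )
    return new_table_ok and old_tables_ok

ok = check_lemma_6_1_rates(Fraction(1, 3), Fraction(1, 2), 3, 40, random.Random(0))
```

The identities hold for every realisation of the tree, not only on average, which is why a single simulated tree suffices for the check.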

We now describe the (α, θ)-recursive tree in terms of a set of "nested"
(α, θ) and (α, 1 - α) partitions.

Lemma 6.2. The distribution of the (α, θ)-recursive tree is characterized
(among distributions on recursive trees on Z_+) by the following properties:

(i) The blocks that correspond to subtrees rooted at different children of
the root form an exchangeable (α, θ) partition of N.

(ii) Let i ≥ 1. Consider the set D(i) ⊆ N \ {1, 2, ..., i} of labels which
are descendants of i in the tree. Conditional on D(i), the blocks that
correspond to subtrees rooted at different children of i form an
exchangeable (α, 1 - α) partition of D(i) and this partition is independent
of the structure of the subtree on Z_+ \ D(i).

One could say that the subtree rooted at any vertex except the root has
the structure of an (α, 1 - α)-recursive tree; in the special case
θ = 1 - α, the whole tree also has this structure and so one has full
self-similarity.

Proof of Lemma 6.2. The partition of N into blocks as in part (i)
determines the labels of the children of the root (because of the property
that labels increase along paths away from the root, so that the children
of the root are the smallest elements of each block of the partition). Then
the subpartitions of the blocks (minus their smallest elements) as in part
(ii) determine the next level of the tree and so on; thus we indeed have a
characterization of the distribution of the tree.

Property (i) is the i = 0 case of Lemma 6.1. The self-similarity and
independence properties in (ii) are most easily seen via the continuous-time
construction of the tree. Considering only the subtree rooted at vertex i,
we see that the process of vertices arriving at that subtree corresponds
precisely to that for building an (α, 1 - α) tree (with a different set of
labels); its structure depends on the evolution of the rest of the tree only
through the sequence of labels which are assigned to new vertices that join
the subtree. □

The following alternative procedure also builds the (α, θ)-recursive tree
and in addition will describe precisely the evolution of the process
(G^{(i)}, i ≥ 0) obtained by removing the vertices 0, 1, 2, ... in turn. This
construction employs the self-similarity and naturally gives the Markov
property of the fragmentation process, which is not easily obtained by other
means.

At stage n of the procedure we have a tree with n + 1 internal vertices 
labeled 0, 1,2, ... ,n, each of which has infinitely many children which are 
leaves. At each stage we label one of the leaves and create new leaves which 
are children of the newly labeled vertex (which, of course, ceases to be a 
leaf itself). Each vertex of the tree carries a weight. The weight of a vertex 
represents the asymptotic frequency of the set of all the labels which will be 
assigned to the descendants of that vertex. In particular: 

• The weight of the root is 1. 

• For any internal (i.e., already labeled) vertex, the weight of the vertex
equals the sum of the weights of all its children. [In fact, the weights of
the children are a splitting, by a PD(α, 1 - α) random vector, of the weight
of the parent, unless the parent is the root, in which case the splitting is
by PD(α, θ).]

We start at stage 0 with a tree that consists of the root, labeled 0, with
weight 1, and an infinite number of unlabeled leaves which are children of
the root and whose weights are given by a PD(α, θ) random vector.

To pass to stage 1, we choose one of the children of the root in a
size-biased way (according to the weights assigned to the vertices) and
assign label 1 to this vertex. We create an infinite number of children of
the newly labeled vertex (which become leaves); these are assigned weights
given by a splitting of the weight of the parent vertex by a PD(α, 1 - α)
random vector.

To pass from stage n to stage n + 1, we similarly choose between the
children of the root in a size-biased way. If we choose an already labeled
vertex, we now choose between the children of that chosen vertex in a
size-biased way. This continues until we reach a leaf. This leaf is now
labeled n + 1; its children are created and their weights are assigned by a
PD(α, 1 - α) splitting as above.

All of the size-biased picks and random vectors are independent. 

Proceeding in this way, one constructs a tree on 0, 1, 2, .... From the
description of the random recursive tree in Lemma 6.2 in terms of nested
exchangeable (α, θ) and (α, 1 - α) partitions, it follows that this tree
indeed has the distribution of an (α, θ)-recursive tree.

We can now use this construction to identify the chain (G^{(i)}, i ≥ 0)
with the chain obtained by applying Frag_α repeatedly.

Lemma 6.3. The process (G^{(i)}, i ≥ 0) is a Markov chain such that,
conditional on G^{(i)}, G^{(i+1)} has the same distribution as
Frag_α(G^{(i)}).

Proof. First note that the weights of all the leaves at stage i correspond
to the state G^{(i)}, recording the asymptotic frequencies of the subtrees
obtained when the vertices 0, 1, ..., i are removed and ranking these
frequencies in decreasing order. The procedure of passing from stage i to
i + 1 (choosing first between the children of the root in a size-biased way,
then between the children of that child in a size-biased way and so on until
a leaf is chosen) is equivalent simply to choosing between all the leaves
in a size-biased way. Given the state of the tree at stage i, this choice of
vertex i + 1 is made independently of previous choices. Thus, given G^{(i)},
we obtain G^{(i+1)} precisely by applying the random operator Frag_α to
G^{(i)} independently of G^{(0)}, ..., G^{(i-1)}. □
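The equivalence used in this proof (a nested size-biased descent reaches each leaf with probability equal to its weight) can be verified exactly on a small example. The sketch below assumes a finite weighted tree, encoded as nested `(weight, children)` tuples, as a stand-in for the infinite one; the encoding and the function name `leaf_probabilities` are illustrative.

```python
from fractions import Fraction

def leaf_probabilities(node, prob=Fraction(1)):
    """Return (weight, probability of being reached) for every leaf, when at
    each internal vertex a child is chosen with probability
    child weight / parent weight."""
    weight, children = node
    if not children:
        return [(weight, prob)]
    out = []
    for child in children:
        out += leaf_probabilities(child, prob * child[0] / weight)
    return out

# Root of weight 1; the child of weight 1/2 is internal, the others are leaves.
tree = (Fraction(1), [
    (Fraction(1, 2), [(Fraction(1, 4), []),
                      (Fraction(1, 8), []),
                      (Fraction(1, 8), [])]),
    (Fraction(1, 3), []),
    (Fraction(1, 6), []),
])
pairs = leaf_probabilities(tree)
```

Because the weight of each internal vertex is the sum of its children's weights, the conditional probabilities telescope along the path and only the leaf weight remains.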

We have proved that (G^{(i)}, i ≥ 0) is a Markov chain and that G^{(i+1)}
is the required fragmentation of G^{(i)}. It remains only to show the
coagulation property. To understand the coagulation mechanism in the tree,
we need some more notation. For i ≥ 0, starting from blocks
B^{(i+1)}_1, B^{(i+1)}_2, ..., create new blocks B̃^{(i)}_1, B̃^{(i)}_2, ...
as follows. Let I^{(i+1)}_1 be an independent Bernoulli((1 - α)/(θ + i + 1))
random variable and recursively generate I^{(i+1)}_2, I^{(i+1)}_3, ... that
take values 0 or 1 via

\[ P\bigl(I^{(i+1)}_{k+1} = 1 \,\big|\, I^{(i+1)}_1, \ldots, I^{(i+1)}_k\bigr) = \frac{1 - \alpha + \alpha \sum_{j=1}^{k} I^{(i+1)}_j}{\theta + i + 1 + \alpha k} \tag{7} \]

independently of B^{(i+1)}. Let

\[ \tilde{B}^{(i)}_1 = \{i + 1\} \cup \bigcup_{k \,:\, I^{(i+1)}_k = 1} B^{(i+1)}_k \]

and let B̃^{(i)}_2, B̃^{(i)}_3, ... be the blocks not contained in the union,
listed in increasing order of smallest element (notice that this retains the
ordering of the indices from the previous step). Let
B̃^{(i)} = (B̃^{(i)}_1, B̃^{(i)}_2, ...).

Lemma 6.4. For i ≥ 1, B̃^{(i)} is a partition of N \ {1, 2, ..., i} and

\[ (\tilde{B}^{(i)}, B^{(i+1)}) \overset{d}{=} (B^{(i)}, B^{(i+1)}). \]

Proof. Suppose that we are given B^{(i+1)}. We know that in the full
recursive tree, each of the blocks B^{(i+1)}_1, B^{(i+1)}_2, ... is connected
to one of the vertices 0, 1, ..., i + 1. We need to know which of them are
connected to i + 1.

Let (a_1, a_2, a_3, ...) be the sequence of children of the nodes
{0, 1, ..., i + 1} in order. (In particular, a_1 = i + 2.) Let
(b_1, b_2, ...) be {i + 2, i + 3, ...} \ {a_1, a_2, ...} in order. Now for
j ≥ 1, let p_j be the parent of node a_j and let q_j be the parent of node
b_j. (Thus p_j ∈ {0, 1, ..., i + 1} and q_j ∈ {i + 2, i + 3, ...}.) From the
continuous-time construction of the random recursive tree, one can deduce
that the three sequences (a_1, a_2, ...), (p_1, p_2, ...) and
(q_1, q_2, ...) are independent. Since B^{(i+1)} is a function of
(a_1, a_2, ...) and (q_1, q_2, ...), we therefore have that (p_1, p_2, ...)
is independent of B^{(i+1)}.

Since p_j is the vertex to which block B^{(i+1)}_j is attached, we need to
know specifically which of the p_j are equal to i + 1. Now, regardless of the
form of the subtree spanned by vertices 0, 1, ..., i + 1, vertex i + 1 has
children at rate 1 - α + α n_{i+1}, where n_{i+1} is the number of children
that it has already, and the group 0, 1, ..., i has children at total rate
θ + i + α + α ñ, where ñ is the combined total number of children that the
group already has other than i + 1. Hence, the probability that p_1 = i + 1
is equal to (1 - α)/(θ + i + 1). For k ≥ 1, let J^{(i+1)}_k be the indicator
function of the event {p_k = i + 1}, which of course is the same as the event
{B^{(i+1)}_k is attached to i + 1}. Then for k ≥ 1 we have

\[ P\bigl(J^{(i+1)}_{k+1} = 1 \,\big|\, J^{(i+1)}_1, \ldots, J^{(i+1)}_k\bigr) = \frac{1 - \alpha + \alpha \sum_{j=1}^{k} J^{(i+1)}_j}{\theta + i + 1 + \alpha k}, \]

so that the sequence (J^{(i+1)}_1, J^{(i+1)}_2, ...) has the same law as the
sequence (I^{(i+1)}_1, I^{(i+1)}_2, ...) constructed at (7). Thus,
conditional on B^{(i+1)}, B̃^{(i)} has the same distribution as B^{(i)}. □

The proof of Theorem 5.1 is now completed by the following lemma. 

Lemma 6.5. Conditional on $G^{(i+1)}$, $G^{(i)}$ has the same distribution as
$\mathrm{Coag}_{\alpha,\theta+i}(G^{(i+1)})$.



Proof. Consider the construction of $\tilde B^{(i)}$ from $B^{(i+1)}$. The random
variables $I_1^{(i+1)}, I_2^{(i+1)}, \dots$ defined at (7) describe the evolution of a
"generalized Pólya urn." In particular, the limit

$$B = \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} I_j^{(i+1)}$$

exists almost surely and has a Beta$((1-\alpha)/\alpha, (\theta+i+\alpha)/\alpha)$ distribution
[except in the case $\alpha = 0$, when $B = 1/(\theta+i+1)$ a.s.]; then, conditional on
$B$, the variables $I_1^{(i+1)}, I_2^{(i+1)}, \dots$ are independent and identically distributed
Bernoulli$(B)$ random variables (see [8] or [17]).
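As an illustrative check of this mixture representation (a sketch for the case $\alpha > 0$, not part of the original argument), one can compare the exact probability of a finite 0/1 pattern under (7) with the Beta moment $\mathbb{E}[B^r (1-B)^s]$, where $r$ and $s$ count the ones and zeros. The helper names `sequence_probability` and `beta_moment` are hypothetical; exact rational arithmetic is used so that the equality is not blurred by rounding.

```python
from fractions import Fraction

def sequence_probability(bits, alpha, theta, i):
    """Exact probability that (I_1, ..., I_n) equals `bits` under (7)."""
    prob = Fraction(1)
    ones = 0  # number of ones among the indicators processed so far
    for k, bit in enumerate(bits):
        p_one = (1 - alpha + alpha * ones) / (theta + i + 1 + alpha * k)
        prob *= p_one if bit else 1 - p_one
        ones += bit
    return prob

def beta_moment(a, b, r, s):
    """E[B^r (1 - B)^s] for B ~ Beta(a, b), via rising factorials."""
    value = Fraction(1)
    for j in range(r):
        value *= a + j
    for j in range(s):
        value *= b + j
    for j in range(r + s):
        value /= a + b + j
    return value

# Example parameters (arbitrary choices for illustration).
alpha, theta, i = Fraction(1, 2), Fraction(1), 2
a = (1 - alpha) / alpha          # first Beta parameter
b = (theta + i + alpha) / alpha  # second Beta parameter
```

With these parameters, the patterns `[1, 0, 1]` and `[0, 1, 1]` both have probability `Fraction(7, 360)`, which equals `beta_moment(a, b, 2, 1)`. The agreement across orderings reflects exchangeability of the indicators.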

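Thus, conditional on $B^{(i+1)}$, the ranked asymptotic frequencies of
$\tilde B^{(i)}$ have the same distribution as $\mathrm{Coag}_{\alpha,\theta+i}(G^{(i+1)})$. Since this distribution
depends on $B^{(i+1)}$ only through $G^{(i+1)}$, we have that the same is true if we
condition instead on $G^{(i+1)}$ (this follows by Dynkin's criterion for a function
of a Markov process to be Markov; see Theorem 10.13 of [11]). But by
Lemma 6.4, we have $(\tilde B^{(i)}, B^{(i+1)}) \stackrel{d}{=} (B^{(i)}, B^{(i+1)})$, so also
$(\tilde G^{(i)}, G^{(i+1)}) \stackrel{d}{=} (G^{(i)}, G^{(i+1)})$, where $\tilde G^{(i)}$ denotes the ranked asymptotic
frequencies of $\tilde B^{(i)}$. Hence, conditional on $G^{(i+1)}$, $G^{(i)}$ has the same distribution
as $\mathrm{Coag}_{\alpha,\theta+i}(G^{(i+1)})$. □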

7. Continuous-time branching models. Pitman [16] gave a construction
of the two-parameter partition structure [or, equivalently, the $(\alpha, \theta)$ Chinese
restaurant process] via continuous-time branching models (possibly with im- 
migration). In this section we discuss the relationship between such a con- 
struction and our recursive tree model. 

We first describe Pitman's construction. Let $0 \leq \alpha < 1$, $\theta > -\alpha$. Consider
a population of individuals of two types: novel and clone. Each individual 
is assigned a color: a novel individual always has a new color, while a clone 
is always assigned the same color as its parent. Every individual has an 
infinite lifetime. Starting from a single novel individual at time t = 0, this 
first individual produces novel offspring according to a Poisson process of 
rate $\theta + \alpha$ and clone offspring according to an independent Poisson process
of rate $1 - \alpha$. The reproduction rules for the other individuals are as follows:

• Novel individuals produce novel offspring according to a Poisson process
of rate $\alpha$ and independently produce clone offspring according to a Poisson
process with rate $1 - \alpha$.

• Clone individuals produce clone offspring according to a Poisson process 
of rate 1. 

Individuals are labeled in the order they appear. The colors of individuals
naturally induce a random partition of $\mathbb{N}$, which has the PD$(\alpha,\theta)$
distribution, by comparison of the growth procedure with the $(\alpha, \theta)$ Chinese
restaurant process.
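The comparison with the Chinese restaurant process can be made concrete by normalizing the branching rates. The sketch below is illustrative only: the function names are hypothetical, and the seating rule used for comparison is the standard $(\alpha, \theta)$ one, in which a newcomer joins an existing block of size $n_j$ with probability $(n_j - \alpha)/(\theta + n)$ and starts a new block with probability $(\theta + \alpha k)/(\theta + n)$, where $n$ is the number of individuals and $k$ the number of blocks.

```python
from fractions import Fraction

def branching_rates(block_sizes, alpha, theta):
    """Rates at which the next individual is novel, or joins each color.

    `block_sizes` lists the current number of individuals of each color.
    """
    k = len(block_sizes)
    # The first individual produces novel offspring at rate theta + alpha;
    # each of the other k - 1 novel individuals does so at rate alpha.
    novel_rate = (theta + alpha) + (k - 1) * alpha
    # A color class is one novel individual (clone rate 1 - alpha) together
    # with size - 1 clones (clone rate 1 each).
    clone_rates = [(1 - alpha) + (size - 1) for size in block_sizes]
    return novel_rate, clone_rates

def crp_probabilities(block_sizes, alpha, theta):
    """Seating probabilities of the (alpha, theta) Chinese restaurant process."""
    n = sum(block_sizes)
    k = len(block_sizes)
    new_block = (theta + alpha * k) / (theta + n)
    join = [(size - alpha) / (theta + n) for size in block_sizes]
    return new_block, join
```

Dividing each rate by the total rate, which equals $\theta + n$, reproduces the seating probabilities exactly when the parameters are passed as `Fraction` values.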

If $\theta > 0$, we may treat the first individual just like any other novel
individual (i.e., producing novel individuals at rate $\alpha$ and clones at rate $1 - \alpha$)
and introduce an independent Poisson migration process of novel individuals
which arrive at rate $\theta$. This way of looking at things provides an easy
way to see the fact (due to Pitman [18]) that taking a PD$(0,\theta)$ random variable
and splitting each block with an independent PD$(\alpha, 0)$ random variable
gives a PD$(\alpha,\theta)$ random variable. Indeed, by ignoring the clone/novel
difference and just looking at the partition generated by the descendants of
the immigrant individuals, we see a $(0, \theta)$ partition. If we then keep track of
the different colors as well, we see a refinement of this partition. Moreover,
within each block of the coarse partition, the colors are generated according
to the rules for an $(\alpha, 0)$ partition.

Let us explain where the branching model construction of an $(\alpha, \theta)$
partition (in the no-immigration setting) differs from the recursive tree
construction. Novel individuals in the branching model are exactly the children
of the root in the recursive tree (although which novel individual is the
"child" of which other novel individual has no meaning in the recursive tree
setup). The growth rates for the subfamilies are the same: the collection
consisting of a novel individual and its clone descendants (say $k$ of them)
produces new clone individuals at rate $1 - \alpha + k$. In the recursive tree,
likewise, the subtree descending from a particular child of the root grows at
rate $(1 - \alpha)(k + 1) + \alpha k = 1 - \alpha + k$. However, the new individuals are not
being added in the same places on the subtree. For example, a novel
individual in Pitman's construction has clone children at fixed rate $1 - \alpha$. In our
subtree, however, the individual at the top of the subtree has children at
rate $1 - \alpha + \alpha \#\{\text{children it has already had}\}$. These genealogical differences
make no difference to the partition obtained, but do make a difference to
the nesting property for successive partitions.
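The rate identity $(1 - \alpha)(k + 1) + \alpha k = 1 - \alpha + k$ holds whatever the shape of the subtree, because the child counts of its $k + 1$ vertices always sum to $k$. A small sketch, with hypothetical helper names and two hand-picked shapes, illustrates both this and the fact that the top vertex's own rate depends on the shape:

```python
from fractions import Fraction

def vertex_rate(num_children, alpha):
    # Recursive-tree rule for a non-root vertex, as described in the text:
    # rate 1 - alpha + alpha * (number of children it has already had).
    return 1 - alpha + alpha * num_children

def subtree_rate(child_counts, alpha):
    """Total growth rate of a subtree, given each vertex's child count."""
    return sum(vertex_rate(c, alpha) for c in child_counts)

alpha = Fraction(1, 3)  # arbitrary illustrative value
chain = [1, 1, 1, 0]    # top vertex, child, grandchild, leaf
star = [3, 0, 0, 0]     # top vertex with three leaf children
```

Both shapes have $k = 3$ descendants of the top vertex and total rate $1 - \alpha + 3$, yet the top vertex itself reproduces at rate $1$ in the chain and $5/3$ in the star, compared with the fixed rate $1 - \alpha = 2/3$ for a novel individual in Pitman's construction.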

The coagulation-fragmentation duality (Theorem 3.1) for $0 \leq \alpha < 1$, $\theta > -\alpha$
can be interpreted through a variant of the branching model construc-
tion. We introduce killing, as in the recursive tree, and also give the clones 
hidden features. These hidden features enable us to obtain the required 
nesting of partitions. More precisely: 

• Each clone individual is different from its brothers and has a brand new 
color, but this difference (in type as well as in color) is invisible until its 
parent is killed. 

• Each clone individual actually generates novel individuals at rate $\alpha$ and
independently generates clone individuals at rate $1 - \alpha$, but this difference
(in type as well as in color) among its offspring is invisible until its parent 
is killed and it becomes novel. 

As in the recursive trees setting, start with a PD$(\alpha,\theta)$ population and kill
the first individual, then the second, then the third and so forth. Notice that 
immediately after killing an individual, its clone children and the offspring of 
those clone children may change their novel/clone status or color and form 
a number of new blocks which fragment the block of the killed individual. 
This provides an alternative encoding of the fragmentation chain. 